HappyDB is a corpus of 100,000 crowd-sourced happy moments. The goal of the corpus is to advance the state of the art of understanding the causes of happiness that can be gleaned from text.
Here, I explore this data set and try to answer the question, “What makes people happy?”
From the packages’ descriptions:
tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures;
tidytext allows text mining using ‘dplyr’, ‘ggplot2’, and other tidy tools;
ggplot2 produces statistical, or data, graphics;
wordcloud provides a visual representation of text data;
ngram is for constructing n-grams (“tokenizing”), as well as generating new text based on the n-gram structure of a given text input (“babbling”);
word2vec learns vector representations of words by continuous bag of words and skip-gram implementations.
First, I process the raw text data in ‘cleaned_hm.csv’, stored in the data folder: I clean the text, remove stop words, and create a tidy version of the texts, which is saved in the output folder.
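The cleaning step can be sketched in base R. This is a minimal illustration rather than the code behind this report: the hmid and cleaned_hm column names, the toy sentences, the tiny stop-word list, and the tidy_tokens() helper are all assumptions, and the actual analysis presumably relied on tidytext.

```r
# Toy stand-in for the text column of cleaned_hm.csv (made-up sentences).
moments <- data.frame(
  hmid = 1:3,
  cleaned_hm = c(
    "I went to dinner with my friends.",
    "My daughter won the game!",
    "I got a bonus at my job."
  ),
  stringsAsFactors = FALSE
)

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
stop_words_small <- c("i", "to", "with", "my", "the", "a", "at")

tidy_tokens <- function(df, stopwords) {
  # Lowercase, then turn everything except letters and spaces into spaces.
  text <- gsub("[^a-z ]", " ", tolower(df$cleaned_hm))
  # Split each moment on runs of whitespace.
  words <- strsplit(trimws(text), "\\s+")
  # One row per (moment, word) pair: the "tidy" layout.
  out <- data.frame(
    hmid = rep(df$hmid, lengths(words)),
    word = unlist(words),
    stringsAsFactors = FALSE
  )
  # Drop stop words so only content-bearing words remain.
  out[!(out$word %in% stopwords), , drop = FALSE]
}

tidy_hm <- tidy_tokens(moments, stop_words_small)
```

Keeping one word per row makes the later steps (counting, TF-IDF, word clouds) simple grouping operations.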
Next, I select a subset of the tidy text data and keep only the columns required for the analysis.
## [1] 94506
## [1] 38707
## [1] 55799
Of the 94,506 curated happy moments, 38,707 were shared by women and 55,799 by men. The pie chart shows that a large majority of these moments come from the United States, followed by India, with the rest of the world contributing far fewer entries.
The category plot shows that most happy moments fall under affection and achievement. This matches my expectations: relationships and accomplishments tend to matter to most people, whereas categories such as nature and exercise may not register as happy moments for people (like myself) who do not particularly enjoy the outdoors or who find exercise tiring. Exploring the words associated with each category could offer deeper insight into what exactly makes people happy within it.
TF-IDF (term frequency-inverse document frequency) scores how important a word is to a document: the score rises with the word’s frequency in that document and falls with the number of documents that contain it. In my analysis, I focused on the top-ranked words within the achievement and affection categories, since these two categories account for most happy moments. In the achievement category, words like “won,” “job,” and “bonus” stood out. In the affection category, words tied to close relationships, such as “daughter” and “family,” ranked highest. I relate strongly to these words and recognize how many happy moments they describe.
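To make the TF-IDF score concrete, here is a small base R sketch that treats each category as one document. The counts are invented for illustration, tf_idf() is a hypothetical helper, and the formula (term share times the natural log of documents over documents containing the word) is one common convention that I am assuming here.

```r
# Toy word counts per category; the numbers are made up.
counts <- data.frame(
  category = c("achievement", "achievement", "achievement",
               "affection", "affection", "affection"),
  word     = c("won", "job", "day", "daughter", "family", "day"),
  n        = c(3, 2, 5, 4, 1, 5),
  stringsAsFactors = FALSE
)

tf_idf <- function(df) {
  # Assumes one row per (category, word) pair.
  # Term frequency: share of the category's words taken up by this word.
  df$tf <- df$n / ave(df$n, df$category, FUN = sum)
  # Inverse document frequency: log(categories / categories containing the word).
  n_docs <- length(unique(df$category))
  docs_with_word <- ave(rep(1, nrow(df)), df$word, FUN = sum)
  df$idf <- log(n_docs / docs_with_word)
  df$tf_idf <- df$tf * df$idf
  # Highest-scoring words first.
  df[order(-df$tf_idf), ]
}

scored <- tf_idf(counts)
```

In this toy table “day” is the most frequent word in both categories, yet it scores zero because it appears in every category, while “daughter” and “won” rise to the top. That is the reason to rank by TF-IDF rather than by raw frequency.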
Now, let’s explore the differences between men and women, since personal experience suggests that what makes girls happy may not excite boys to the same degree. Is there really a difference, and if so, how large is it?
The histogram shows that women report most of their happy moments in the affection category, while men’s happy moments peak in the achievement category. This may help explain the higher competitiveness often observed in boys: the terms with the highest TF-IDF in the achievement category, such as “won,” point to accomplishments that appear to amplify their happiness considerably.
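Because men contributed more moments than women (55,799 versus 38,707), comparing raw counts per category can mislead; comparing within-gender shares is safer. The sketch below shows one way to do that in base R. The gender and category column names and the toy rows are assumptions, not the report’s data.

```r
# Toy rows standing in for the joined moment and demographic data (made-up values).
hm <- data.frame(
  gender = c("f", "f", "f", "m", "m", "m", "m"),
  category = c("affection", "affection", "achievement",
               "achievement", "achievement", "affection", "leisure"),
  stringsAsFactors = FALSE
)

# Rows are genders, columns are categories.
counts <- table(hm$gender, hm$category)

# margin = 1 normalises within each row, so every gender's shares sum to 1.
shares <- prop.table(counts, margin = 1)

# Most common category for each gender.
top_category <- colnames(shares)[apply(shares, 1, which.max)]
names(top_category) <- rownames(shares)
```

With these toy rows, top_category is “affection” for f and “achievement” for m, mirroring the pattern described above.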
Since affection is the leading category for women, I looked at word frequency within it. Of the ten most frequent words, all except “home” and “love” refer to family members. For women, the presence of loved ones is central to happiness. What, then, contributes most to men’s happiness?
For men, I analyzed word frequency within achievement, their leading category. Words like “car,” “job,” and “money” stood out, pointing to possessions or finances. Most of the top words, however, were action-oriented (e.g., “bought,” “received”), which suggests that completing meaningful actions matters to men’s happiness. Women, by contrast, seem to place less emphasis on personal achievements when reflecting on their happy moments.
Does this imply that men find fulfillment only through work? Probably not, since affection is also a significant category for them. So it is worth investigating who makes men feel happy.
Words most similar to “friend” in women’s happy moments:
## $friend
## term1 term2 similarity rank
## 1 friend friendship 0.9389431 1
## 2 friend hang 0.9215116 2
## 3 friend conversation 0.9207855 3
## 4 friend fun 0.9159433 4
## 5 friend engaged 0.9138008 5
## 6 friend family 0.9066549 6
## 7 friend forget 0.9065868 7
## 8 friend invited 0.9038376 8
## 9 friend fuzzy 0.9033049 9
## 10 friend enjoyed 0.9027944 10
Words most similar to “friend” in men’s happy moments:
## $friend
## term1 term2 similarity rank
## 1 friend fun 0.9560235 1
## 2 friend gap 0.9518284 2
## 3 friend guest 0.9431605 3
## 4 friend friendship 0.9395139 4
## 5 friend invited 0.9292281 5
## 6 friend gathering 0.9274477 6
## 7 friend fond 0.9266509 7
## 8 friend guard 0.9247046 8
## 9 friend goodbye 0.9235395 9
## 10 friend interact 0.9207920 10
Looking at the word clouds, a standout term for both genders is “friend.” In the word embeddings, the words most similar to “friend” include “friendship,” “fun,” and “invited” for both genders. For women the list also includes “family” (similarity 0.91), which may indicate that friends and family are closely linked in their happy moments; for men it instead includes words such as “gap” and “guard.” Both genders share a strong association between “friend” and “fun,” with a similarity of 0.92 for women and 0.96 for men, which points to a shared joy in companionship.
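To show what a similarity ranking like the tables above measures, here is a base R sketch that ranks words by cosine similarity to a target word. It assumes the reported similarity behaves like cosine similarity; the three-dimensional vectors are invented, and nearest() is a hypothetical helper, not a function from the word2vec package.

```r
# Hypothetical 3-dimensional embeddings; real word2vec vectors are learned and much longer.
emb <- rbind(
  friend     = c(0.9, 0.1, 0.3),
  friendship = c(0.8, 0.2, 0.3),
  fun        = c(0.7, 0.0, 0.6),
  job        = c(0.1, 0.9, 0.2)
)

nearest <- function(emb, term, top_n = 10) {
  # Scale every row to unit length so a dot product equals cosine similarity.
  unit <- emb / sqrt(rowSums(emb^2))
  sims <- drop(unit %*% unit[term, ])
  # Exclude the target word itself, then keep the top_n most similar words.
  sims <- sort(sims[names(sims) != term], decreasing = TRUE)
  sims <- head(sims, top_n)
  data.frame(term1 = term, term2 = names(sims),
             similarity = unname(sims), rank = seq_along(sims),
             stringsAsFactors = FALSE)
}

res <- nearest(emb, "friend", top_n = 3)
```

Scores from two separately trained models (here, one per gender) live in different vector spaces, so small differences such as 0.92 versus 0.96 are best read as rough indications rather than precise comparisons.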
Now, let’s look at the specific activities that each gender most enjoys with friends.
Apart from special events like “birthdays,” women lean towards “surprises” and “celebrations” with their friends, whereas men tend to enjoy “playing” and “games.” Both genders enjoy “dinners” and “parties” equally. In the word cloud, words used only by women are shown in red and words used only by men in blue.
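The red and blue highlighting amounts to a set difference between the two vocabularies. A minimal base R sketch follows, with word lists hand-picked for illustration rather than taken from the data; the grey colour for shared words is my assumption.

```r
# Hand-picked illustrative vocabularies, not the actual top words.
women_words <- c("birthday", "dinner", "party", "surprise", "celebration")
men_words   <- c("birthday", "dinner", "party", "game", "played")

# Words that appear in one list but not the other.
only_women <- setdiff(women_words, men_words)
only_men   <- setdiff(men_words, women_words)
shared     <- intersect(women_words, men_words)

# Colour each word by who uses it; grey for shared words is an assumed choice.
all_words <- union(women_words, men_words)
colour <- ifelse(all_words %in% only_women, "red",
                 ifelse(all_words %in% only_men, "blue", "grey"))
names(colour) <- all_words
```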
The bigram analysis supports the same finding.
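For readers unfamiliar with bigrams, the sketch below builds and counts adjacent word pairs in base R. The sentences are made up, bigrams() is a hypothetical helper, and base R stands in for the ngram package listed earlier.

```r
bigrams <- function(text) {
  # Lowercase, strip punctuation, and split into words.
  words <- strsplit(trimws(gsub("[^a-z ]", " ", tolower(text))), "\\s+")[[1]]
  if (length(words) < 2) return(character(0))
  # Pair each word with the word that follows it.
  paste(head(words, -1), tail(words, -1))
}

# Made-up moments for illustration.
moments <- c(
  "I went to dinner with friends",
  "Birthday party with friends",
  "Played games with friends"
)

all_bigrams <- unlist(lapply(moments, bigrams))
bigram_counts <- sort(table(all_bigrams), decreasing = TRUE)
```

Here “with friends” is the most common pair, showing how bigrams keep the context that single-word counts lose.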
Below are some happy moments related to dinner and party for women:
## # A tibble: 2 × 1
## original_hm
## <chr>
## 1 I made dinner for my boyfriend and he complimented me on it. I felt happy for…
## 2 I was happy because I went to dinner last night with friends I hadn't seen mo…
## # A tibble: 2 × 1
## original_hm
## <chr>
## 1 "we went for birthday party of my son's friend at today 11.30 . I talked wit…
## 2 "I GOT GOVERNMENT JOB\r\nFRIENDS MEETING\r\nFRIENDS GET TO GATHER PARTY\r\nI'…
Below are some happy moments related to dinner and party for men:
## # A tibble: 2 × 1
## original_hm
## <chr>
## 1 I took my girlfriend out for a nice Birthday dinner and got to watch her open…
## 2 I went out for dinner with my friends and enjoyed a lot.
## # A tibble: 2 × 1
## original_hm
## <chr>
## 1 My friends thrown a surprise party for my promotion yesterday, so happy.
## 2 partying all night with some old friends was one of the happy moments
Since a majority of the entries come from the USA, I am curious whether the sources of happiness in the United States are similar to those in the rest of the world.
The word clouds show that “friend” is consistently the most prevalent term regardless of geographic region. Moments spent with friends bring happiness not only to us but also to people in other parts of the world.
The word cloud displays words that appear only in entries from outside the United States. Notably, terms for particular activities are conspicuously absent, which suggests that people in different countries take part in similar activities with their friends.
Despite geographical and cultural differences, bonding with friends through similar activities appears to be a fundamental element of human happiness.
Below are some happy moments associated with friends in the US:
## # A tibble: 5 × 1
## original_hm
## <chr>
## 1 "We were competing against another team in a online game, I was playing with …
## 2 "when i received flowers from my best friend"
## 3 "I saw two close friends that I haven't seen for a couple months."
## 4 "I cooked my girlfriend a wonderful breakfast."
## 5 "My girlfriend told me she loved me and that she hoped to marry me one day."
Below are some happy moments associated with friends outside the US:
## # A tibble: 5 × 1
## original_hm
## <chr>
## 1 We had a serious talk with some friends of ours who have been flaky lately. T…
## 2 went to movies with my friends it was fun
## 3 A hot kiss with my girl friend last night made my day
## 4 We celebrated our anniversary with my boyfriend.
## 5 my son went to football match with his friends, he came home very happy that …
People commonly find happiness either by achieving personal goals or by experiencing affection. Men tend to find happiness in completing tasks or reaching milestones, while women more often find it in meaningful relationships and connections.
When describing their happy moments, men often use action-oriented verbs that emphasize their accomplishments. Women, by contrast, commonly tie the people in their lives to the actions that make them happy.
The term “friend” stands out in narratives of happiness. Both men and women value the company of friends, but their preferred activities differ: men generally enjoy “games,” whereas women enjoy “celebrations” and “surprises.” Both groups enjoy sharing meals and celebrating birthdays with their friends.
Wherever they live, people value spending meaningful time with their friends, and the activities they share are strikingly similar.